Lightweight predicate extraction for patient-level cancer information and ontology development
Authors
Abstract
BACKGROUND Knowledge engineering for ontological knowledgebases is resource- and time-intensive. To alleviate these costs, especially for novices, automated tools from the natural language processing domain can assist in ontology development. We focus on developing ontologies for the public health domain and use patient-centric sources from MedlinePlus related to cancers caused by HPV. METHODS This paper demonstrates the use of a lightweight open information extraction (OIE) tool to derive accurate knowledge triples that can seed an ontological knowledgebase. We developed a custom application that interfaces with an information extraction software library to facilitate producing knowledge triples from textual sources. RESULTS The extractions were accurate, with precision ranging from 80% to 89%. These triples can later be transformed into an OWL/RDF representation for our planned ontological knowledgebase. CONCLUSIONS OIE delivers an effective and accessible method for developing ontologies.
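To make the triple-to-RDF step concrete, here is a minimal Python sketch of how extracted (subject, predicate, object) triples might be serialized as N-Triples and how extraction precision might be computed from manual judgements. It is an illustration under stated assumptions, not the authors' application: the namespace `http://example.org/hpv#`, the helper names `slug`, `to_ntriples`, and `precision`, and the sample triple are all hypothetical and do not come from the paper.

```python
import re

# Hypothetical namespace for illustration only; not taken from the paper.
BASE = "http://example.org/hpv#"


def slug(text):
    """Turn a free-text phrase into a camel-cased local name safe for an IRI."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return "".join(w[0].upper() + w[1:] for w in words)


def to_ntriples(triples, base=BASE):
    """Serialize (subject, predicate, object) string triples as N-Triples lines."""
    lines = []
    for subj, pred, obj in triples:
        lines.append(
            f"<{base}{slug(subj)}> <{base}{slug(pred)}> <{base}{slug(obj)}> ."
        )
    return "\n".join(lines)


def precision(judgements):
    """Fraction of extractions judged correct (True) by a human reviewer."""
    return sum(judgements) / len(judgements)


if __name__ == "__main__":
    # Illustrative triple, not an extraction reported in the paper.
    sample = [("HPV", "can cause", "cervical cancer")]
    print(to_ntriples(sample))
    # Five hypothetical reviewer judgements, four of them correct.
    print(precision([True, True, True, True, False]))
```

In this sketch every element of a triple becomes an IRI in one namespace. A real ontology would likely distinguish classes, object properties, and literals, which is a modeling decision the abstract does not describe.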
Similar resources
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
An eXtreme method for developing lightweight ontologies
In this paper we propose a method for effective building of lightweight ontologies, applying the principles of eXtreme Programming. The method is based on a multi-layered approach, which combines the advantages of maximum monotonic extensibility with clarity of the desired terms. The method aims to make it easy to develop a simple ontology, being careful to avoid deviation between the user's ex...
Automatic Predicate Argument Structure Analysis of the Penn Chinese Treebank
Recent work in machine translation and information extraction has demonstrated the utility of a level that represents the predicate-argument structure. It would be especially useful for machine translation to have two such Proposition Banks, one for each language under consideration. A Proposition Bank for English has been developed over the last few years, and we describe here our development ...
Linguistically Light Lexical Extensions for Ontologies
An increasing number of enterprises are beginning to incorporate semantic web ontologies into their Information Extraction (IE) and Text Analytics (TA) applications. This can be challenging for a TA group wishing to avail of semantic web ontologies due to the manual effort of retargeting and tailoring language resources within the TA system to a new domain to meet customer needs. A lightweight lexi...
A Lightweight Ontology Approach to Scalable Interoperability
There are many different kinds of ontologies used for different purposes in modern computing. Lightweight ontologies are easy to create, but difficult to deploy; formal ontologies are relatively easy to deploy, but difficult to create. This paper presents an approach that combines the strengths and avoids the weaknesses of lightweight and formal ontologies. In this approach, the ontology include...
Journal title:
Volume 17, Issue
Pages -
Publication date: 2017